pip install transformers datasets accelerate
Requirement already satisfied: transformers in d:\anaconda\lib\site-packages (4.49.0)
Requirement already satisfied: datasets in d:\anaconda\lib\site-packages (3.3.1)
Requirement already satisfied: accelerate in d:\anaconda\lib\site-packages (1.4.0)
Requirement already satisfied: torch>=2.0.0 in d:\anaconda\lib\site-packages (from accelerate) (2.6.0)
Note: you may need to restart the kernel to use updated packages.
pip install tf-keras
Requirement already satisfied: tf-keras in d:\anaconda\lib\site-packages (2.18.0)
Requirement already satisfied: tensorflow<2.19,>=2.18 in d:\anaconda\lib\site-packages (from tf-keras) (2.18.0)
Note: you may need to restart the kernel to use updated packages.
!python3 -m pip install --upgrade "optree>=0.13.0"

Load a pre-trained model and tokenizer

from transformers import AutoModelForSequenceClassification, AutoTokenizer
model_name = "bert-base-uncased"
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)  # 3 classes: positive, negative, neutral
tokenizer = AutoTokenizer.from_pretrained(model_name)
D:\Anaconda\Lib\site-packages\torch\utils\_pytree.py:185: FutureWarning: optree is installed but the version is too old to support PyTorch Dynamo in C++ pytree. C++ pytree support is disabled. Please consider upgrading optree using `python3 -m pip install --upgrade 'optree>=0.13.0'`.
  warnings.warn(
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Load data

import pandas as pd
import numpy as np
df = pd.read_csv("sentiment_product_reviews.csv")
df.head(2)
                                      comment  label
0    Moderate performance, works as intended.      1
1  The product is just okay, nothing special.      1
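
Before tokenizing, it can help to confirm that the label column matches what the model expects. The sketch below assumes labels are integer class ids in the range 0 to 2 (to match num_labels=3). The small in-memory DataFrame only stands in for sentiment_product_reviews.csv, and the helper name check_labels is hypothetical.

```python
import pandas as pd

NUM_LABELS = 3  # assumption: matches num_labels=3 passed to the model


def check_labels(frame: pd.DataFrame, num_labels: int = NUM_LABELS) -> pd.Series:
    """Validate the 'comment' and 'label' columns and return per-class counts."""
    if frame["comment"].isna().any():
        raise ValueError("some rows have a missing comment")
    labels = frame["label"]
    if not pd.api.types.is_integer_dtype(labels):
        raise TypeError("labels must be integer class ids")
    bad = labels[(labels < 0) | (labels >= num_labels)]
    if not bad.empty:
        raise ValueError(f"labels outside 0..{num_labels - 1}: {sorted(bad.unique())}")
    # Per-class row counts, ordered by class id.
    return labels.value_counts().sort_index()


# Toy stand-in for the CSV: the first two rows come from df.head(2),
# the last two are invented for illustration.
toy = pd.DataFrame(
    {
        "comment": [
            "Moderate performance, works as intended.",
            "The product is just okay, nothing special.",
            "Example comment A.",
            "Example comment B.",
        ],
        "label": [1, 1, 0, 2],
    }
)
counts = check_labels(toy)
print(counts)
```

On the real data, the same call would be check_labels(df). A heavily skewed count for one class would be worth knowing about before training.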

Convert the DataFrame to a Hugging Face Dataset

from datasets import Dataset
dataset = Dataset.from_pandas(df)
print(dataset)
Dataset({
    features: ['comment', 'label'],
    num_rows: 20000
})
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Define the tokenization function
def tokenize_function(examples):
    return tokenizer(examples["comment"], padding="max_length", truncation=True)
tokenized_dataset = dataset.map(tokenize_function, batched=True)
#print(tokenized_dataset[0])
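
With padding="max_length", every comment is padded to the tokenizer's maximum length (512 tokens for bert-base-uncased), even when the review is a single short sentence. One common alternative is dynamic padding, where each batch is padded only to its own longest sequence. In transformers, DataCollatorWithPadding works this way. The pure-Python sketch below illustrates the idea. The helper pad_batch and the sample token ids are hypothetical and not part of the notebook.

```python
from typing import List

PAD_ID = 0        # id of the [PAD] token in bert-base-uncased
MAX_LENGTH = 512  # length produced by padding="max_length" for this tokenizer


def pad_batch(batch: List[List[int]], pad_id: int = PAD_ID) -> List[List[int]]:
    """Pad every sequence to the length of the longest one in the batch."""
    longest = max(len(seq) for seq in batch)
    return [seq + [pad_id] * (longest - len(seq)) for seq in batch]


# Made-up token ids for three short comments (illustrative only).
batch = [
    [101, 8777, 2836, 102],
    [101, 1996, 4031, 2003, 2074, 3100, 102],
    [101, 2307, 102],
]

padded = pad_batch(batch)
dynamic_tokens = sum(len(seq) for seq in padded)
fixed_tokens = len(batch) * MAX_LENGTH

print(padded)
print(f"dynamic padding: {dynamic_tokens} tokens, max_length padding: {fixed_tokens} tokens")
```

In the notebook, this would roughly mean dropping padding="max_length" from tokenize_function and passing data_collator=DataCollatorWithPadding(tokenizer) to Trainer. Treat it as something to try; how much it helps depends on how long the comments are.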
tokenized_dataset = tokenized_dataset.train_test_split(test_size=0.2)
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)

# Configure TrainingArguments
training_args = TrainingArguments(
    output_dir="./results",
    run_name="my_sentiment_analysis_experiment",
    eval_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
    save_strategy="epoch",
    logging_dir="./logs",
    logging_steps=10,
    report_to="none",  # Disable W&B logging
)

# Initialize the Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
)
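
As configured, the Trainer reports only training and validation loss. A metric function can be passed through the Trainer's compute_metrics argument so that accuracy is also reported at each evaluation. The sketch below is one minimal version using NumPy only. The choice of accuracy is an assumption about what is worth measuring here, and the sample logits are invented.

```python
import numpy as np


def compute_metrics(eval_pred):
    """Return accuracy given a (logits, labels) pair from evaluation."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)  # highest-scoring class per row
    accuracy = float((predictions == labels).mean())
    return {"accuracy": accuracy}


# Invented logits for four examples and three classes.
sample_logits = np.array(
    [
        [2.0, 0.1, -1.0],
        [0.2, 1.5, 0.3],
        [-0.5, 0.0, 0.9],
        [1.1, 1.0, 0.2],
    ]
)
sample_labels = np.array([0, 1, 2, 1])

print(compute_metrics((sample_logits, sample_labels)))
```

It would be wired in by adding compute_metrics=compute_metrics to the Trainer(...) call.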

# Train the model
trainer.train()
[ 31/3000 23:22 < 39:53:41, 0.02 it/s, Epoch 0.03/3]
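
The 3000 in the progress bar follows from the settings used: 20000 rows with test_size=0.2 leave 16000 training rows, which at a batch size of 16 gives 1000 optimizer steps per epoch, times 3 epochs. The sketch below reproduces that arithmetic. It assumes a single device and no gradient accumulation, and the helper name total_training_steps is hypothetical.

```python
import math


def total_training_steps(num_rows: int, test_size: float, batch_size: int, epochs: int) -> int:
    """Optimizer steps for one device without gradient accumulation (assumed)."""
    test_rows = int(round(num_rows * test_size))
    train_rows = num_rows - test_rows
    steps_per_epoch = math.ceil(train_rows / batch_size)
    return steps_per_epoch * epochs


steps = total_training_steps(num_rows=20000, test_size=0.2, batch_size=16, epochs=3)
print(steps)  # 3000
```

At the reported 0.02 it/s, a run of this size takes well over a day on this machine, so it may be worth trying a subset of the data or shorter sequences first.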